ABSTRACT

When designing products, it is crucial to assure failure-free and risk-free operation in the intended operating
environment. Failures are typically studied and eliminated as much as possible during the early stages of
design. The few failures that go undetected result in unacceptable damage and losses in high-risk applications
where public safety is of concern. Published NASA and NTSB accident reports point to a variety of
components identified as sources of failures in the reported cases. In previous work, data from these reports
were processed and placed in matrix form for all the system components and failure modes encountered,
and then manipulated using matrix methods to determine similarities between the different components and
failure modes. In this paper, these matrices are represented in the form of a linear combination of failure
modes, mathematically formed using Principal Components Analysis (PCA) decomposition. The PCA
decomposition results in a low-dimensionality representation of all failure modes and components of interest,
represented in a transformed coordinate system. Such a representation opens the way for efficient pattern
analysis and prediction of the failure modes with the highest potential risks to the final product, rather than making
decisions based on the large space of component and failure mode data. The mathematics of the proposed
method are explained first using a simple example problem. The method is then applied to component failure
data gathered from helicopter accident reports to demonstrate its potential.


•Corresponding author. Submitted for review, Journal of Mechanical Design. 





KEYWORDS 

Failure-free product development; Failure prevention; Function-failure similarity analysis; Principal components 
analysis; Risk-based design; Failure mode identification. 


PRELIMINARIES 

Prevention of potential failure modes during product development is especially crucial in high-risk aerospace
applications, where failures are unacceptable at any frequency. Failure modes are analyzed thoroughly during the
early stages of design to prevent occurrence during operation. However, the number of failure modes for each of
the components that make up a system can be overwhelming when predictions are performed, especially in complex
systems. In our work, component, failure, and functionality information is derived from engineering drawings and
specifications, accident reports, and functional bases, to establish a link between functionality of components and the
potential failure modes (Collins and Hagan, 1976; Harris et al., 2000; NTSB, 2001; Stone and Wood, 2000; Shafer,
1980). This information has been used by the authors to draw similarities between different designs (Turner and Stone,
2001; Roberts et al., 2002; Turner et al., 2002) using matrix manipulations of the component, failure, and functionality
data. The overall goal of our work is to address the failure modes early in conceptual design. To achieve this goal,
functions are mapped to the failure modes experienced by a component that performs the particular functions
(Turner and Stone, 2001; Roberts et al., 2002; Turner et al., 2002).

In the current paper, we present a means of decomposing large design problems for failure analysis and prevention
purposes. In the case of complex engineering systems, the number of components and their interactions with each
other, as well as their interactions with the operating environment, can be overwhelmingly large. Working from
the original component-failure, component-function, and failure-function matrices can be especially difficult when
predictions need to be made to determine safety, performance, and the associated risks. To address this problem,
we present an approach to decompose the initial matrices (derived for the function-failure
similarity analysis) and derive a low-dimensional representation of the large space of components, failure modes, and
functions of relevance. Specifically, this paper proposes a decomposition method which reduces the dimensionality
of the function-component-failure space by means of an orthogonal transformation. The focus is on components and
their failure modes, where each component is related to the potential failure modes. The orthogonal decomposition
provides a method of determining the failure modes that have the most impact, as well as the failure modes that are
redundant in the information that they provide. During the early design stages, the failure modes with the greatest
potential impact can be given priority in order to reduce risk, as well as design time and cost. Using such a decomposition
approach, the designer can focus on the failure modes that have the potential of becoming a risk factor during the
lifecycle of the complex system under investigation.

Failure Prevention and Reliability for Design 

Reliability, maintainability, and effectiveness of machines and systems depend heavily upon the understanding, 
recognition, and prevention or elimination of mechanical failures (Collins and Hagan, 1976). The quality of a particular 
design depends heavily on the ability of the product to function in the given lifecycle, as defined by the customer or 
user of the product (Ruff and Paasch, 1993). As products become more complex, prevention of failure modes through 
analysis in the early stages of design becomes very complex and cumbersome. 

For applications such as aircraft, the risks associated with missed failures are very high: not only is safety a major
issue due to the high probability of fatalities (Harris et al., 2000), but the costs involved in repairs and downtimes can
become a major burden. A study by the Boeing Company showed that, for a fleet of 100 aircraft, the costs generated from
delays due to aircraft failure are about $2M per year. (This accounts for revenue loss, increased handling of passengers
and cargo, and extra crew wages.) The cost of maintenance alone adds another $4M per year (Stander, 1982; Ruff and
Paasch, 1993). In order to eliminate or reduce the possibility of failure, designers and manufacturing engineers need
to be aware of all of the potentially significant failure modes in the systems being designed.

There are several techniques for identifying failure modes that are commonly used during conceptual design. Some
examples of these techniques are checklists, FMEA (failure modes and effects analysis), FMECA (failure modes, effects,
and criticality analysis), and FTA (fault tree analysis) (Carter, 1997; Henley and Kumamoto, 1992). The details of
these methods are explained in (Turner and Stone, 2001) for reference. In our work, we make use of the information
gathered for such techniques, and combine it with information from NTSB and NASA accident reports, maintenance
guides, and engineering specifications. This information is then presented to the designer in a form that is easy to
analyze and use during the early stages of design. The methods developed in this work are meant to augment the
information derived from the more traditional approaches (Turner and Stone, 2001; Roberts et al., 2002; Turner et al.,
2002).

Orthogonal Decomposition for Dimensionality Reduction 

The orthogonal decomposition method proposed in this work is based on previous work reported by Turner et
al. (Turner et al., 2000) to extract high-variance modes from product surface profiles. This method is extended here to
isolate the failure modes with the highest variance, to determine tradeoffs during component development and provide
a low-dimensional representation of the significant failure modes for potential classification and prediction purposes
(Turner and Stone, 2001).

Consider an $m \times n$ input matrix $X$, whose columns consist of the variables under study, and whose rows correspond
to each observation. The $n \times n$ covariance matrix $\Sigma_X$ is computed by first computing the $1 \times n$ mean vector $\bar{X}$, removing
the mean vector from each of the $m$ observations to obtain the centered matrix $X_0$, and computing $\Sigma_X = X_0^T X_0/(m - 1)$
($m - 1$ is the rank of the $n \times n$ symmetric covariance matrix if $m < n$, since one additional degree of freedom is lost
by removing the mean vector) (Fukunaga, 1990).

The positive semidefinite symmetric covariance matrix has $k$ nonzero (positive) eigenvalues, where $k$ is the
rank of the matrix, determined by the number of independent rows. In this case, if $m < n$, and losing one degree
of freedom by removing the mean vector, the rank $k$ of the covariance matrix equals $m - 1$. The eigenvalues and
eigenvectors of the covariance matrix are computed using the characteristic equation of the $\Sigma_X$ matrix, namely
$|\Sigma_X - \lambda I| = 0$, with the eigenvectors corresponding to two different eigenvalues $\lambda_i$ and $\lambda_j$ being orthogonal. This equation
can be rewritten in matrix form as $\Sigma_X \times V = V \times D$, subject to the orthonormality constraint $V^T \times V = I$, with the
following eigenvalue (diagonal) and eigenvector matrices:


$$D = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}, \qquad V = [V_1 \; V_2 \; \ldots \; V_n].$$


The eigenvector matrix $V$ can be used as the transformation matrix to transform the $n$-dimensional $X_0$ to another vector $Y$
using the orthogonal transformation $Y = V^T \times X_0$, where the covariance matrix of $Y$ is $D$ (from $\Sigma_Y = V^T \times \Sigma_X \times V = D$).
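The diagonal covariance of the transformed vector follows in a few steps from the eigenvalue equation. The short derivation below is a sketch that assumes $X_0$ is treated as a zero-mean random vector with covariance $\Sigma_X$, and it uses only the symbols already defined in this section.

```latex
\begin{align}
\Sigma_Y &= E\left[ Y Y^T \right]
          = E\left[ V^T X_0 X_0^T V \right]
          = V^T \Sigma_X V \\
         &= V^T (V D) = (V^T V) D = D
\end{align}
```

The second line uses $\Sigma_X V = V D$ and the orthonormality constraint $V^T V = I$, so the transformed variables are uncorrelated and their variances are the eigenvalues on the diagonal of $D$.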

This final observation leads to several important conclusions: 1) The orthogonal transformation may be broken
down to $n$ separate equations $y_i = V_i^T \times X$; 2) $Y$ represents $X$ in the new coordinate system spanned by $V_1, \ldots, V_n$, and
hence is a coordinate transformation; 3) The transformation matrix is the eigenvector matrix of $\Sigma_X$. Since the
eigenvectors are the ones that maximize the distance function $d^2(X)$, we are in effect selecting the principal components
of the distribution as the new coordinate axes; 4) The eigenvalues are the variances of the transformed variables $y_i$; 5)
Since the transformation is orthogonal, Euclidean distances are preserved, i.e., $\|Y\|^2 = \|X\|^2$. When the eigenvalues
are listed in descending order, the resulting eigenvectors correspond to the principal components starting with the
highest variance, indicated by the amplitude of the corresponding eigenvalues. The input matrix can then be represented
in this new coordinate system using the orthogonal transformation (Fukunaga, 1990).

Figure 1. A Desktop Rotating Machinery Test Rig.
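The properties listed above can be checked numerically on a small synthetic data set. The following is a minimal sketch in Python with NumPy (the paper itself uses Matlab); the function name `pca_eig`, the random data, and the matrix sizes are illustrative assumptions and do not come from the original work.

```python
import numpy as np

def pca_eig(X):
    """PCA via eigendecomposition of the sample covariance matrix.

    X holds observations in rows and variables in columns (m x n).
    Returns eigenvalues in descending order, the matching eigenvectors
    as columns, and the scores (transformed observations).
    """
    X0 = X - X.mean(axis=0)                  # remove the 1 x n mean vector
    cov = X0.T @ X0 / (X.shape[0] - 1)       # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = X0 @ eigvecs                    # row i is the transformed observation i
    return eigvals, eigvecs, scores

# Hypothetical data: m = 4 observations of n = 6 variables, so m < n.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))
X0 = X - X.mean(axis=0)

eigvals, V, Y = pca_eig(X)

rank = int(np.sum(eigvals > 1e-10))          # number of nonzero eigenvalues
print("rank of covariance matrix:", rank)
print("orthonormal eigenvectors:", np.allclose(V.T @ V, np.eye(6)))
print("score variances equal eigenvalues:",
      np.allclose(Y.var(axis=0, ddof=1), eigvals, atol=1e-10))
print("distances preserved:",
      np.allclose(np.linalg.norm(Y, axis=1), np.linalg.norm(X0, axis=1)))
```

With four observations and six variables, the covariance matrix has rank $m - 1 = 3$, which matches the degree-of-freedom argument given earlier in this section.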


DECOMPOSITION-BASED FAILURE MODE IDENTIFICATION METHOD 

The orthogonal decomposition method described above is used in this work to decompose the space of failure 
modes and component functions for large complex systems. It is intended as a means to focus attention on the 
significant failure modes based on the maximum variance criterion. To explain the derivation of the eigenvectors, 
eigenvalues, and corresponding weights, a simple example problem using a rotating machinery simulator model is 
used next. 

Test Rig Example 

The simple example hypothesizes that the design of a rotating machinery test rig goes through detailed analysis
by design engineers to assure failure-free and risk-free performance (Turner and Stone, 2001). The test rig design includes
a shaft attached to a motor by means of a coupling and supported by two sets of ball bearings; the shaft drives a gear box
via two belts, which in turn drives a load, as shown in Figure 1. This system, located at NASA Ames Research Center,
is used to simulate vibrational fault situations (Turner and Huff, 2002). The same example was used in
demonstrating the mechanics of the function-failure similarity method developed by Turner and Stone (Turner and
Stone, 2001). Some of the explanation of the initial matrices is repeated in this paper for clarity.

Initial Matrices 

Three components considered in this example are: the shaft, gears, and bearings (Turner and Stone, 2001). Let C
be an $m \times 1$ vector of subsystems and/or components for the application domain under study (e.g., rotorcraft, aircraft,
space station, Mars rover, Mars polar lander, etc.). Let F be an $n \times 1$ vector of failures commonly found in that
application domain. Selecting a subset from elementary failure modes, these components are assumed to be subject
to wear, fatigue, corrosion, fretting, and impact failure modes (Collins and Hagan, 1976). The m component vectors
are aggregated together to form CF, the $m \times n$ component-failure matrix, where n is the total number of failure modes
occurring across all m components. The matrix has n failure modes in its columns (representing the variables), and
m components in its rows (representing the various observations). Table 1 presents the aggregated component-failure
matrix, with 1s representing an occurrence of a failure for a given component, and 0s representing non-occurrence.
Note that the columns correspond to the failure modes (F1 is wear, F2 is fatigue, F3 is corrosion, F4 is fretting, and
F5 is impact), and the rows correspond to the components under study (C1 is a gear, C2 is a bearing, and C3 is the
shaft).


Table 1. Component-Failure Matrix CF.


Table 2. PC Matrix for CF.

          PC1       PC2       PC3       PC4       PC5
F1      0.3943   -0.5869    0.0425    0.7058    0.0000
F2     -0.4792   -0.3220   -0.5750    0.0347   -0.5787
F3      0.4792    0.3220   -0.7877    0.0475    0.2095
F4      0.3943   -0.5869   -0.0425   -0.7058   -0.0000
F5     -0.4792   -0.3220   -0.2128    0.0128    0.7882


Table 3. SC Matrix for CF.

          PC1       PC2       PC3       PC4       PC5
C1     -0.2163   -0.7133   -0.0000    0.0000   -0.0000
C2      1.2214    0.2527   -0.0000   -0.0000    0.0000
C3     -1.0050    0.4606   -0.0000   -0.0000   -0.0000


Table 4. LAT Vector (Eigenvalues) for CF.

1.2743    0.3924    0.0000

Principal Axes of Variation for Design 

Let $\Sigma_{CF} = CF_0^T \times CF_0/(m - 1)$ be the covariance matrix of the component-failure matrix CF, an $n \times n$ symmetric
matrix (n is the number of elemental failure modes), where $CF_0$ denotes CF with its mean vector removed. In this work,
Principal Components Analysis (PCA) is used to
compute the transformed variables, eigenvectors, and eigenvalues, described in the previous section. In the following,
the PC matrix corresponds to the eigenvector matrix V, the SC matrix corresponds to the transformed vector Y, and
the LAT vector contains the diagonal elements of the eigenvalue matrix D, which represent the eigenvalues of the
covariance matrix of the input data.

The input matrix CF, with m = 3 and n = 5, is defined in Table 1. Using the centered input matrix $CF_0 = CF - \overline{CF}$,
the PCA script in Matlab results in the principal components, scores, and latent values, shown in Tables 2, 3, and 4.
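The computation can be sketched in Python with NumPy in place of the Matlab script. An important caveat: the entries of Table 1 are not available in this copy, so the binary matrix `CF` below is an assumption, chosen because it is consistent with the values reported in Tables 2 to 4; it should not be read as a quotation of Table 1. The variable names `pc`, `sc`, and `lat` simply mirror the PC, SC, and LAT labels used in the text.

```python
import numpy as np

# ASSUMPTION: a 3 x 5 binary matrix consistent with Tables 2-4, not a
# quotation of Table 1. Rows: C1 gear, C2 bearing, C3 shaft.
# Columns: F1 wear, F2 fatigue, F3 corrosion, F4 fretting, F5 impact.
CF = np.array([
    [1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
], dtype=float)

m, n = CF.shape
CF0 = CF - CF.mean(axis=0)             # centered input matrix
cov = CF0.T @ CF0 / (m - 1)            # 5 x 5 covariance matrix

lat, pc = np.linalg.eigh(cov)          # eigenvalues come back in ascending order
order = np.argsort(lat)[::-1]          # sort so the highest variance comes first
lat, pc = lat[order], pc[:, order]
sc = CF0 @ pc                          # scores: one row per component

# Eigenvector signs are arbitrary, so compare magnitudes with the tables.
# Columns 3-5 of pc span the null space and are not unique.
print("LAT (first two):", np.round(lat[:2], 4))
print("|pc1|:", np.round(np.abs(pc[:, 0]), 4))
print("|scores on pc1|:", np.round(np.abs(sc[:, 0]), 4))
```

Under this assumed matrix, the two nonzero eigenvalues and the magnitudes of the first eigenvector and score column agree with Tables 2 to 4 to four decimal places; a different eigen-solver may return the columns with opposite signs.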

The PC matrix provides the eigenvectors of the $5 \times 5$ covariance matrix, providing the coefficients of the new
coordinate system described by the principal axes, with respect to the old coordinate system described by the variables
F1, F2, etc. The columns of this matrix correspond to each of the principal components, and the values in each
row represent the coordinate based on the original variables $F_i$. The principal axes correspond to the directions with
maximum variability, and provide a simpler and more parsimonious (low-dimensional) description of the covariance
structure (Johnson and Wichern, 1992). The coordinate transformation is shown schematically in Figure 2 for a case
with three variables F1, F2, and F3 only.

Figure 2. Coordinate Transformation Using PCA.

As an example, the first principal component can be used to describe the original variables in the transformed
coordinate system as a linear combination of all five failure modes as follows:
$pc_1 = 0.3943F1 - 0.4792F2 + 0.4792F3 + 0.3943F4 - 0.4792F5$.
Using this relationship, the designer can deduce that F2, F3, and F5 have a higher effect than
F1 and F4, and that F2 and F3 have an equal but contrasting effect on the first principal component, and so on. The
eigenvalues of the covariance matrix are represented in the LAT vector, shown in Table 4. Note that with an eigenvalue
of 1.27, the first principal component accounts for 76.46% of the total variance in the data, and hence is sufficient to
represent the failure information in a simpler (more parsimonious) manner, and can be considered as a model of the
sample data. The second principal component has an eigenvalue of 0.39, and accounts for the remaining 23.54% of
the variance. (There are only two nonzero eigenvalues in this case since the rank of the covariance matrix is m - 1 = 2. The
remaining eigenvalues are zero, and their eigenvectors belong to the null space.)
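The percentages quoted above are each eigenvalue divided by the sum of all eigenvalues. A minimal sketch of that arithmetic, using the values of Table 4 (the variable names are illustrative):

```python
# Eigenvalues from Table 4 (LAT vector).
lat = [1.2743, 0.3924, 0.0000]

total = sum(lat)                                   # total variance in the data
explained = [100.0 * value / total for value in lat]

for i, pct in enumerate(explained, start=1):
    print(f"PC{i}: {pct:.2f}% of total variance")
```

This yields 76.46% for the first principal component and 23.54% for the second, as stated in the text.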

The scores in the SC matrix provide the relative weight for the eigenvectors on each of the observations
(components), and are computed as $CF_0 \times PC$. The scores are then interpreted as corresponding to the pattern of the variation
for each eigenvector over the different machinery components ($C_i$) under study. The first column of the SC matrix
corresponds to the first principal component, with each row corresponding to each component C1, C2, and C3
(observations). The second column corresponds to the second principal component. (The remaining columns belong to the
null space, since the rank of the covariance matrix in this case was m - 1 = 2.) The variance of the scores for the first
principal component (first column of SC) equals the first eigenvalue ($\lambda_1 = 1.27$), and the variance of the scores for the
second principal component equals the second eigenvalue ($\lambda_2 = 0.39$). Using this example, for the first component
C1 (gear), the first principal mode has a weight of -0.21, whereas for the second component C2 (bearing), the same
principal mode has a weight of 1.22, hence indicating a stronger influence on this component.
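The link between score variances and eigenvalues can be confirmed directly from the tabulated numbers. The sketch below takes the first two columns of Table 3 and computes their sample variance (division by m - 1) with Python's standard library; the variable names are illustrative.

```python
from statistics import variance

# First two columns of the SC matrix (Table 3); rows are C1, C2, C3.
sc_pc1 = [-0.2163, 1.2214, -1.0050]
sc_pc2 = [-0.7133, 0.2527, 0.4606]

var_pc1 = variance(sc_pc1)   # sample variance, divides by m - 1 = 2
var_pc2 = variance(sc_pc2)

print(f"variance of scores on PC1: {var_pc1:.4f}")
print(f"variance of scores on PC2: {var_pc2:.4f}")
```

Up to the rounding of the tabulated scores, the two variances match the eigenvalues 1.2743 and 0.3924 of Table 4.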

The transformed representation of the failure information in terms of a principal mode can be used by designers
to decide on tradeoffs in terms of failures. For example, failure modes F2, F3, and F5 have a more significant effect
on the overall performance and quality of the product than failure modes F1 and F4, as indicated by the first column
of the PC matrix. Based on this information, the designer might want to pay closer attention to F2, F3, and F5,
and be less concerned with F1 and F4. For example, in this case, the bearing component C2 depends more
heavily on these three modes, as indicated by the first column of the SC matrix.
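One simple way to read such a tradeoff from the PC matrix is to rank the failure modes by the magnitude of their coefficients in the first principal component. The sketch below does this for the first column of Table 2; ranking by absolute value is an illustrative choice made here, not a procedure prescribed by the paper.

```python
# First column of the PC matrix (Table 2): coefficients of pc1 on F1..F5.
pc1 = {"F1": 0.3943, "F2": -0.4792, "F3": 0.4792, "F4": 0.3943, "F5": -0.4792}

# Rank by magnitude; the sign only indicates the direction of the contribution.
ranked = sorted(pc1, key=lambda mode: abs(pc1[mode]), reverse=True)

for mode in ranked:
    print(f"{mode}: |coefficient| = {abs(pc1[mode]):.4f}")
```

The three modes at the top of the ranking are F2, F3, and F5, each with magnitude 0.4792, followed by F1 and F4 with magnitude 0.3943.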
Figure 3. Turbine Subsystem for an OH58 Helicopter.


APPLICATION TO THE RISK-FREE DESIGN OF LARGE SYSTEMS 

To assure a failure and risk-free product, designers make use of any information and previous knowledge about 
potential failure modes that might occur during a system’s lifecycle. In this work, we propose to reduce the load on 
the designer by concentrating on a linear transformation of failure and component data gathered from real accident 
reports, eliminating the need to sort through large amounts of data. A feasibility study is presented in this section 
using a rotorcraft system as an example, first introduced in (Roberts et al., 2002). 

Rotorcraft Failures and Functions 

The application is a Bell 206 Helicopter whose army counterpart, an OH58 helicopter, is located at NASA Ames 
for flight research purposes (Huff et al., 2002). Helicopter accident reports published by the National Transportation
Safety Board (NTSB) and NASA were carefully studied to determine the common failure modes and the components and
subsystems affected by these failures (Harris et al., 2000; NTSB, 2001; Roberts et al., 2002). Maintenance guides, 
engineering schematics, and design specifications for this type of rotorcraft were studied thoroughly to determine the 
components and subsystems of relevance (Shafer, 1980). 

The engine and power train subsystems were identified as the primary systems where failures occurred. As an
example, a schematic of the turbine system contained inside the engine of a Bell 206 helicopter is shown in Figure 3,
along with a detailed description of the components contained in the assembly (Roberts et al., 2002). Twenty-nine components
and subsystems were identified as potentially causing failures. These components had a total of 10 failure modes
reported in the accident reports. There were 1000 accident reports involving the Bell 206 helicopters, and 69 of these
corresponded to component failures for the engine and power train. Tables 5 and 6 present the components and failure
modes extracted from the reports (Roberts et al., 2002).


Table 5. Components from Helicopter Accident Reports (C).

Element   Description
C1        air discharge tubes
C2        bearing
C3        bleed valve
C4        bolt
C5        compressor case
C6        compressor mount
C7        compressor wheel
C8        coupling
C9        diffuser scroll
C10       exhaust collector
C11       fire wall
C12       front diffuser
C13       front support
C14       governor
C15       housing
C16       impeller
C17       mount
C18       nozzle
C19       nozzle shield
C20       O-ring
C21       P3 line
C22       plastic lining
C23       pressure control line
C24       pylon isolator mount
C25       rear diffuser
C26       rotor
C27       shaft
C28       spur adapter gearshaft
C29       turbine wheel


Table 6. Failure Modes from Helicopter Accident Reports (F).

Element   Description
F1        bond failure
F2        corrosion
F3        fatigue
F4        fracture
F5        fretting
F6        galling and seizure
F7        human
F8        stress rupture
F9        thermal shock
F10       wear

Reduction of the Component-Failure Space for Design Use 

Using the vectors from Tables 5 and 6, the input matrix CF, with m = 29 and n = 10, is defined as in Table 7.
With the mean vector removed, the PCA decomposition results in the principal components, scores, and latent values
shown in Table 8.


Table 7. CF matrix from helicopter failure and component data.

From the PC matrix in Table 8, the first principal component can be used to describe the original variables in the 
new (transformed) coordinate system as a linear combination of all 10 potential failure modes as follows: 


$$pc_1 = -0.0079\,F1 - 0.0438\,F2 + 0.8786\,F3 + 0.0134\,F4 + 0.0132\,F5 + 0.0467\,F6 - 0.0131\,F7 + 0.1023\,F8 + 0.4604\,F9 + 0.0308\,F10. \qquad (1)$$


The first principal component is a transformed version of the original failure modes, with the coefficients indicating the
relative significance of each failure mode. As observed, failure modes F3, F8, and F9 have the highest contribution
to the first principal component. The variance of the scores for the first principal component (first column of SC)
equals the first eigenvalue in the LAT vector ($\lambda_1 = 2.40$, 67.3%), and the variance of the scores for the second principal
component equals the second eigenvalue ($\lambda_2 = 0.73$, 18.86%). Using the SC matrix in Table 8, a plot of the first score
vector is shown in Figure 4, which shows the distribution of the first principal component over the 29 components in
the subsystems being studied. In this example, components C2 (bearing), C7 (compressor wheel), C28 (spur adapter
for gear shaft), and C29 (turbine wheel) have the highest weighting for the first principal component.
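The dominant failure modes in Equation (1) can be picked out mechanically, and because eigenvectors are normalized to unit length, the coefficients can also be checked for consistency. The sketch below uses the ten coefficients of Equation (1); selecting the top three by absolute value is an illustrative choice, not a rule stated in the paper.

```python
import math

# Coefficients of pc1 from Equation (1), indexed by failure mode.
pc1 = {
    "F1": -0.0079, "F2": -0.0438, "F3": 0.8786, "F4": 0.0134, "F5": 0.0132,
    "F6": 0.0467, "F7": -0.0131, "F8": 0.1023, "F9": 0.4604, "F10": 0.0308,
}

# An eigenvector of the covariance matrix should have unit Euclidean norm.
norm = math.sqrt(sum(c * c for c in pc1.values()))

# Three failure modes with the largest coefficient magnitudes.
top3 = sorted(pc1, key=lambda mode: abs(pc1[mode]), reverse=True)[:3]

# Fraction of the squared eigenvector length carried by each of them.
share = {mode: pc1[mode] ** 2 / norm ** 2 for mode in top3}

print(f"norm of pc1: {norm:.4f}")
for mode in top3:
    print(f"{mode}: coefficient {pc1[mode]:+.4f}, squared share {share[mode]:.3f}")
```

The norm comes out at 1.0000 to four decimals, and the three largest coefficients belong to F3 (fatigue), F9 (thermal shock), and F8 (stress rupture), in that order.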



Figure 4. Distribution of Scores for 1st PC over the 29 Components.

Potential Uses and Benefits 

In this work, we are proposing the decomposition provided by the PCA transformation as a tool to analyze and
predict the effect of potential failure modes on the system being designed. In the helicopter case study above, the
first principal component, which is a linear combination of all 10 failure modes obtained from the accident reports,
explained 67.3% of the total variance in the CF data, followed by the second principal component which explained
18.8% of the total variance, adding up to over 85% of the total variance in the data. The scores corresponding to these
two principal components determine the weight of each of the 29 components under study. The principal components
with the highest variance and their relative effects on the design components can be studied using the scores, as
shown in Figure 4, eliminating the need to go through every component and failure combination. When starting from
large component-failure matrices (CF) for complex systems, the decomposition provided by this method will enable
a study of the most critical failure modes in an efficient way. Using a large database of components, systems, and
potential failure modes, the few dominant principal components (high-variance eigenvectors) resulting from the PCA
decomposition can be used as a "model" of the component-failure information in the system. Any new component or
set of components can be compared with this model to predict the severity of the potential failures.

Consider, for example, a large component-failure matrix CF for a complex engineering system, decomposed using
the PCA-based approach presented here, and reduced to three principal components $pc_1 = \sum_i^n \alpha_i F_i$, $pc_2 = \sum_i^n \beta_i F_i$, and
$pc_3 = \sum_i^n \gamma_i F_i$. These three eigenvectors in the transformed domain are assumed to contain the majority of the total
variance in the original data (see derivation above). The three eigenvectors can be stored as the model of the large
component-failure database and used to analyze a new set of components subject to a given failure mode. Let $X$ be
an $n \times 1$ vector of components under study to determine the effect of a potential failure mode $F_i$. The projection of the
new vector $X$ onto the individual eigenvectors represented by $pc_1$, $pc_2$, and $pc_3$, computed as $X \times pc_1^T$, will provide
the relative weighting of the first principal component on the components under investigation. Such an approach will
help designers concentrate on the components that have the highest potential of exhibiting the particular failure mode.
A similar analysis can be carried out for a single component subject to a number of failure modes contained in the
original database of components and failures.
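The idea of storing a few eigenvectors as a model and projecting new data onto them can be sketched as follows. This is one possible reading of the procedure, under stated assumptions: a new component is described by a row of failure-mode indicators, it is centered with the stored mean, and its scores are its coordinates on the retained eigenvectors. The data, the function names `fit_pca_model` and `project`, and the choice of three retained components are hypothetical.

```python
import numpy as np

def fit_pca_model(CF, k):
    """Return the column means, the k leading eigenvectors, and their eigenvalues."""
    mean = CF.mean(axis=0)
    CF0 = CF - mean
    cov = CF0.T @ CF0 / (CF.shape[0] - 1)
    lat, pc = np.linalg.eigh(cov)              # ascending eigenvalues
    order = np.argsort(lat)[::-1][:k]          # indices of the k largest
    return mean, pc[:, order], lat[order]

def project(x, mean, pcs):
    """Scores of one or more component rows on the stored eigenvectors."""
    return (np.atleast_2d(x) - mean) @ pcs

# Hypothetical component-failure data: 6 components (rows), 5 failure modes (columns).
CF = np.array([
    [1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
], dtype=float)

mean, pcs, lat = fit_pca_model(CF, k=3)        # the stored low-dimensional model

# Hypothetical new component, described by its failure-mode indicators.
new_component = np.array([1, 0, 1, 1, 0], dtype=float)
scores = project(new_component, mean, pcs)

print("variance retained:", round(float(lat.sum() / np.trace(np.cov(CF.T))), 3))
print("scores of new component on pc1..pc3:", np.round(scores, 4))
```

Comparing the scores of the new component with those of the components already in the database indicates which existing components it most resembles in terms of the dominant failure patterns.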



CLOSURE 

This work aims to provide design methodologies for failure and risk-free product design in complex engineering 
systems. This paper discussed an approach to reduce the dimensionality of overwhelmingly large component-failure 
information (required for the function-failure similarity analysis published previously) by means of a mathematical 
decomposition using Principal Components Analysis. The fundamentals of the method were demonstrated using a 
simple test rig example, followed by an application of the method to a case study of failures and components extracted 
from helicopter accident reports. The potential use in design was discussed in terms of providing a low-dimensional 
model of the component and failure database. The method requires a large database of all possible components and 
failure modes for a set of subsystems, either from maintenance manuals, FMEA documents, or accident reports. The 
purpose of the paper was to demonstrate initial feasibility. Further analysis of large complex systems and their failures 
is necessary to establish the value of taking such a decomposition approach to failure-free product design. A large 
anomaly/problem reporting database at NASA’s Jet Propulsion Laboratory is currently being studied for analysis using 
the methods discussed in this paper. 